From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning

Valadi, Viktor, Åkesson, Mattias, Östman, Johan, Toor, Salman, Hellander, Andreas

arXiv.org Artificial Intelligence

Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference-mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.
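To see why shared gradients can leak inputs at all, consider the textbook case of a single fully connected layer, where the input is recoverable analytically from the weight and bias gradients. This is a minimal sketch of that classic observation, not the paper's attacks:

```python
import numpy as np

rng = np.random.default_rng(0)

# A single client sample passing through one fully connected layer.
x = rng.normal(size=4)          # private input
W = rng.normal(size=(3, 4))     # layer weights
b = np.zeros(3)

# Forward pass and a simple squared-error loss against a random target.
y = W @ x + b
target = rng.normal(size=3)
dL_dy = 2 * (y - target)        # gradient of the loss w.r.t. the layer output

# Gradients the client would share in federated learning.
grad_W = np.outer(dL_dy, x)     # dL/dW = dL/dy * x^T
grad_b = dL_dy                  # dL/db = dL/dy

# Analytic inversion: each row of grad_W is a scaled copy of x,
# and grad_b supplies the scale, so x = grad_W[i] / grad_b[i].
x_recovered = grad_W[0] / grad_b[0]
print(np.allclose(x_recovered, x))  # True
```

Deeper networks, larger batches, and training-mode noise (dropout, batch statistics) break this clean analytic recovery, which is exactly the gap between inference-mode and training-mode vulnerability the paper studies.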


X-VFL: A New Vertical Federated Learning Framework with Cross Completion and Decision Subspace Alignment

Yao, Qinghua, Xu, Xiangrui, Li, Zhize

arXiv.org Artificial Intelligence

Vertical Federated Learning (VFL) enables collaborative learning by integrating disjoint feature subsets from multiple clients/parties. However, VFL typically faces two key challenges: i) the requirement for perfectly aligned data samples across all clients (missing features are not allowed); ii) the requirement for joint collaborative inference/prediction involving all clients (it does not support locally independent inference on a single client). To address these challenges, we propose X-VFL, a new VFL framework designed to deal with the non-aligned data samples with (partially) missing features and to support locally independent inference of new data samples for each client. In particular, we design two novel modules in X-VFL: Cross Completion (XCom) and Decision Subspace Alignment (DS-Align). XCom can complete/reconstruct missing features for non-aligned data samples by leveraging information from other clients. DS-Align aligns local features with completed and global features across all clients within the decision subspace, thus enabling locally independent inference at each client. Moreover, we provide convergence theorems for different algorithms used in training X-VFL, showing an $O(1/\sqrt{T})$ convergence rate for SGD-type algorithms and an $O(1/T)$ rate for PAGE-type algorithms, where $T$ denotes the number of training update steps. Extensive experiments on real-world datasets demonstrate that X-VFL significantly outperforms existing methods, e.g., achieving a 15% improvement in accuracy on the image CIFAR-10 dataset and a 43% improvement on the medical MIMIC-III dataset. These results validate the practical effectiveness and superiority of X-VFL, particularly in scenarios involving partially missing features and locally independent inference.
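The abstract does not spell out how XCom reconstructs missing features, so the following is only a rough illustration of the cross-completion idea under a hypothetical linear simplification (the least-squares map and all variable names are invented, not the paper's method): one party fits a map from its own features to another party's features on the aligned samples, then uses it to complete the non-aligned ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two parties hold disjoint feature subsets; rows are shared sample IDs.
# True underlying relation (unknown to the parties): feats_b ~ feats_a @ M.
M = rng.normal(size=(3, 2))
feats_a = rng.normal(size=(100, 3))                        # party A's features
feats_b = feats_a @ M + 0.01 * rng.normal(size=(100, 2))   # party B's features

# Only the first 80 samples are aligned; B's features are missing afterwards.
aligned = slice(0, 80)
missing = slice(80, 100)

# "Cross completion" sketch: fit a least-squares map from A's features to
# B's on the aligned rows, then reconstruct B's missing rows.
M_hat, *_ = np.linalg.lstsq(feats_a[aligned], feats_b[aligned], rcond=None)
feats_b_hat = feats_a[missing] @ M_hat

err = np.abs(feats_b_hat - feats_b[missing]).mean()
print(err < 0.05)  # True: reconstruction error is on the order of the noise
```

X-VFL's actual XCom module is learned jointly with the model rather than being a closed-form regression, but the sketch shows why aligned samples are the resource that makes completion of non-aligned ones possible.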


Enhancing Predictive Maintenance in Mining Mobile Machinery through a TinyML-enabled Hierarchical Inference Network

de la Fuente, Raúl, Radrigan, Luciano, Morales, Anibal S

arXiv.org Artificial Intelligence

Mining machinery operating in variable environments faces high wear and unpredictable stress, challenging Predictive Maintenance (PdM). This paper introduces the Edge Sensor Network for Predictive Maintenance (ESN-PdM), a hierarchical inference framework across edge devices, gateways, and cloud services for real-time condition monitoring. The system dynamically adjusts inference locations--on-device, on-gateway, or on-cloud--based on trade-offs among accuracy, latency, and battery life, leveraging Tiny Machine Learning (TinyML) techniques for model optimization on resource-constrained devices. Performance evaluations showed that on-sensor and on-gateway inference modes achieved over 90% classification accuracy, while cloud-based inference reached 99%. On-sensor inference reduced power consumption by approximately 44%, enabling up to 104 hours of operation. Latency was lowest for on-device inference (3.33 ms), increasing when offloading to the gateway (146.67 ms) or cloud (641.71 ms). The ESN-PdM framework provides a scalable, adaptive solution for reliable anomaly detection and PdM, crucial for maintaining machinery uptime in remote environments. By balancing accuracy, latency, and energy consumption, this approach advances PdM frameworks for industrial applications.


ProFuser: Progressive Fusion of Large Language Models

Shi, Tianyuan, Wan, Fanqi, Huang, Canbin, Quan, Xiaojun, Li, Chenliang, Yan, Ming, Zhang, Ji

arXiv.org Artificial Intelligence

While fusing the capacities and advantages of various large language models (LLMs) offers a pathway to construct more powerful and versatile models, a fundamental challenge is to properly select advantageous models during training. Existing fusion methods primarily focus on the training mode, which uses cross entropy on ground truth in a teacher-forcing setup to measure a model's advantage, and which may provide limited insight into model advantage. In this paper, we introduce a novel approach that enhances the fusion process by incorporating both the training and inference modes. Our method evaluates model advantage not only through cross entropy during training but also by considering inference outputs, providing a more comprehensive assessment. To combine the two modes effectively, we introduce ProFuser to progressively transition from inference mode to training mode. To validate ProFuser's effectiveness, we fused three models, including vicuna-7b-v1.5, Llama-2-7b-chat, and mpt-7b-8k-chat, and demonstrated improved performance in knowledge, reasoning, and safety compared to baseline methods.
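ProFuser's exact schedule is not given in the abstract; the following is a minimal sketch of the general idea, assuming a linear weight shift between an inference-mode score and a training-mode cross-entropy signal (the function names, the linear schedule, and the sign convention are all illustrative):

```python
def fusion_weight(step: int, total_steps: int) -> float:
    """Linearly shift emphasis from inference mode (1.0) to training mode (0.0)."""
    return max(0.0, 1.0 - step / total_steps)

def model_advantage(ce_loss: float, inference_score: float,
                    step: int, total_steps: int) -> float:
    """Blend the two advantage signals; lower CE loss means higher advantage."""
    w = fusion_weight(step, total_steps)
    return w * inference_score + (1.0 - w) * (-ce_loss)

# Early on, the inference-mode score dominates; later, training-mode CE does.
early = model_advantage(ce_loss=2.0, inference_score=0.8, step=0, total_steps=100)
late = model_advantage(ce_loss=2.0, inference_score=0.8, step=100, total_steps=100)
print(early)  # 0.8
print(late)   # -2.0
```

The point of the progressive transition is that inference outputs are expensive but informative early on, while teacher-forced cross entropy becomes a more reliable signal as the fused model converges.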


Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers

Scherer, Moritz, Macan, Luka, Jung, Victor, Wiese, Philip, Bompani, Luca, Burrello, Alessio, Conti, Francesco, Benini, Luca

arXiv.org Artificial Intelligence

The latest evolutions in mainstream Artificial Intelligence (AI) have been driven by Transformers, which have taken over from Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) as the leading edge models for language processing and multi-modal applications [1], [2]. The success of Transformers can be primarily attributed to the emergence of the Foundation Model (FM) paradigm: large Transformer models extensively pre-trained on datasets spanning trillions of tokens and then fine-tuned with a much lower volume of labeled data to solve domain-specific problems. Following the success of FMs in Natural Language Processing (NLP) [1], [3], an increasing number of fields [...]. Despite many recent successes with previous-generation Deep Neural Networks (DNNs), the emergence of the tinyML paradigm for EFMs faces the dual challenge of reducing FMs to a manageable size and enabling their deployment on tiny devices. A first concrete step in this direction is the recent introduction of Small Language Models (SLMs): FMs with tens to a few hundred million, rather than several billion, parameters [8], [9]. While most currently available FMs are focused on processing natural language at a proof-of-concept scale, the effort towards embedded multi-modal sensor inputs with small-scale, application-specific FMs offers a highly promising path for the development of this novel class of models.


Sparsifying Spiking Networks through Local Rhythms

Olin-Ammentorp, Wilkie

arXiv.org Artificial Intelligence

It has been well-established that within conventional neural networks, many of the values produced at each layer are zero. In this work, I demonstrate that spiking neural networks can prevent the transmission of spikes representing values close to zero using local information. This can reduce the amount of energy required for communication and computation in these networks while preserving accuracy. Additionally, this demonstrates a novel application of biologically observed spiking rhythms.
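The paper's mechanism builds on biologically observed spiking rhythms; the following is only a simplified sketch of the underlying idea that each neuron can suppress near-zero transmissions using purely local information (the threshold gating here is illustrative, not the paper's rhythm-based scheme):

```python
import numpy as np

def gate_spikes(values: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Suppress transmissions whose encoded value is close to zero.

    Each neuron inspects only its own activation to decide whether to
    emit a spike, so no global coordination is needed.
    """
    mask = np.abs(values) >= threshold
    return np.where(mask, values, 0.0)

activations = np.array([0.02, -0.5, 0.09, 1.3, -0.04, 0.2])
gated = gate_spikes(activations)

# Half of the activations fall below the threshold and are never transmitted,
# saving communication and computation while the large values survive.
kept_fraction = np.count_nonzero(gated) / activations.size
print(kept_fraction)  # 0.5
```

As in the paper's experiments, the accuracy cost of such gating depends on how much information the suppressed near-zero values actually carried.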


Using Whisper (speech-to-text) and Tortoise (text-to-speech)

#artificialintelligence

I’ll demonstrate how to extract an audio clip from YouTube, implement speech recognition using OpenAI’s Whisper, and perform speech generation using Tortoise to clone a custom voice.


Clothes and color extraction with Generative Adversarial Network

#artificialintelligence

So how can we solve this problem? I tried to replace the background of the original image with a solid color manually and realized that with this kind of input the model produces much better results. But how can this job be automated?


Making AI FaaSt

#artificialintelligence

Drascalita Haut: Today we're going to talk about functions, and in particular Functions as a Service, and apply it to AI in order to present a solution that seems to bring strategic advantages when deploying AI services at scale. During this session, it may feel like we're dancing a bit, moving through tools and new technologies; you might even see some new steps, like workflows or methods to work with AI. And for those of you that know salsa, you know that it starts with a step forward. So today, I'm going to start with some bold statements, but bear with us: I'm going to take a step back, and then me and AK are going to rehearse something through a live demo, which hopefully is going to go just fine, to illustrate what we're talking about. Let me start with a step forward: the FaaS value prop. What does FaaS bring that more and more people are talking about? I came up with three reasons. Number one is FaaSter to prototype, FaaSter to create services, because we work with code, with functions, just code, and we just push the code as it is. Second, never pay for idle. FaaS platforms have the capability to shut down the parts of the system that are not used, so we don't incur any cost. And the third one is a low maintenance overhead. That's because FaaS platforms usually take away the burden of creating containers, keeping them up to date, applying security updates, auto-scaling the functions, and deploying them in multiple regions. In other words, FaaS boldly claims that you will find it easier to build more services, and you're going to pay less. Now, this is a pretty bold statement, isn't it? So allow me to take a step back and look at how developers are producing microservices today. A few years ago, we realized that microservices are better than monoliths because, in essence, they add flexibility and they simplify the experience. At the same time, it's also less risky to independently update parts of the system. And I would assume that many of us know what microservices are.
A very high-level microservice architecture is in this slide. So the final solution basically consists of isolated pieces, each with its own independent deployment lifecycle. Now, microservices used to be deployed in their own VMs, and then containers came, and it was such a revolution because we were able to concurrently run multiple services in isolation in the same VM.